Google Page Entity Extraction Template

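Workflow Overview

This is a medium-complexity workflow with 6 nodes (five functional nodes plus one sticky note). It receives a URL through a webhook, fetches that page's HTML, sends it to Google's Natural Language API for entity analysis, and returns the result to the caller.

Workflow Source Code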

{
  "id": "4wPgPbxtojrUO7Dx",
  "meta": {
    "instanceId": "f46651348590f9c7e3e7fe91218ed49590c553ab737d5cc247951397ff85fa93"
  },
  "name": "Google Page Entity Extraction Template",
  "tags": [
    {
      "id": "hBkrfz3jN0GbUgJa",
      "name": "Google Page Entity Extraction Template",
      "createdAt": "2025-05-08T23:29:39.011Z",
      "updatedAt": "2025-05-08T23:29:39.011Z"
    }
  ],
  "nodes": [
    {
      "id": "8719f1de-2a3e-4c34-9edc-e4b8f993b525",
      "name": "Respond to Webhook",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        1240,
        -420
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1.1
    },
    {
      "id": "01420fd5-3483-4e74-b9fc-971199898449",
      "name": "Google Entities",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1020,
        -420
      ],
      "parameters": {
        "url": "https://language.googleapis.com/v1/documents:analyzeEntities",
        "method": "POST",
        "options": {},
        "jsonBody": "={{ $json.apiRequest }}",
        "sendBody": true,
        "sendQuery": true,
        "sendHeaders": true,
        "specifyBody": "json",
        "queryParameters": {
          "parameters": [
            {
              "name": "key",
              "value": "YOUR-GOOGLE-API-KEY"
            }
          ]
        },
        "headerParameters": {
          "parameters": [
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "5c1c258a-44ed-4d5a-a22d-cddb4df09018",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -300,
        -700
      ],
      "parameters": {
        "color": 4,
        "width": 620,
        "height": 880,
        "content": "# Google Page Entity Extraction Template

## What this workflow does
This workflow allows you to extract named entities (people, organizations, locations, etc.) from any web page using Google's Natural Language API. Simply send a URL to the webhook endpoint, and the workflow will fetch the page content, process it through Google's entity recognition service, and return the structured entity data.

### How to use
1. Replace \"YOUR-GOOGLE-API-KEY\" with your actual Google Cloud API key (Natural Language API must be enabled)
2. Activate the workflow and use the webhook URL as your endpoint
3. Send a POST request to the webhook with a JSON body containing the URL you want to analyze: {\"url\": \"https://example.com/page\"}
4. Review the returned entity analysis with categories, salience scores, and metadata

## Webhook Input Format
The webhook expects a POST request with a JSON body in this format:
```json
{
  \"url\": \"https://website-to-analyze.com/page\"
}
```
## Response Format
The webhook returns a JSON response containing the full entity analysis from Google's Natural Language API, including:

- Entity names and types (PERSON, LOCATION, ORGANIZATION, etc.)
- Salience scores indicating entity importance
- Metadata and mentions within the text
- Entity sentiment (if available)"
      },
      "typeVersion": 1
    },
    {
      "id": "79add9a7-adca-4ce5-8a6a-5fcb75288846",
      "name": "Get Url",
      "type": "n8n-nodes-base.webhook",
      "position": [
        360,
        -420
      ],
      "webhookId": "2944c8f6-03cd-4ab8-8b8e-cb033edf877a",
      "parameters": {
        "path": "2944c8f6-03cd-4ab8-8b8e-cb033edf877a",
        "options": {},
        "httpMethod": "POST",
        "responseMode": "responseNode"
      },
      "typeVersion": 2
    },
    {
      "id": "081a52bc-2da7-44fb-bdc3-4cb73cbf8dd3",
      "name": "Get URL Page Contents",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        580,
        -420
      ],
      "parameters": {
        "url": "={{ $json.body.url }}",
        "options": {}
      },
      "typeVersion": 4.2
    },
    {
      "id": "dda5ef3d-f031-4dd6-b117-c1f69aa66b63",
      "name": "Respond with detected entities",
      "type": "n8n-nodes-base.code",
      "position": [
        800,
        -420
      ],
      "parameters": {
        "jsCode": "// Clean and prepare HTML for API request
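const html = $input.item.json.data;
// Fail with a clear message if the previous node did not return text
if (typeof html !== \"string\") {
  throw new Error(\"Expected the page HTML as a string in json.data\");
}
// Cap the payload at 100,000 characters to keep the API request small (optional)
const trimmedHtml = html.length > 100000 ? html.substring(0, 100000) : html;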

return {
  json: {
    apiRequest: {
      document: {
        type: \"HTML\",
        content: trimmedHtml
      },
      encodingType: \"UTF8\"
    }
  }
}"
      },
      "typeVersion": 2
    }
  ],
  "active": false,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "432203af-190a-4a89-81d8-f86682a0b63f",
  "connections": {
    "Get Url": {
      "main": [
        [
          {
            "node": "Get URL Page Contents",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Google Entities": {
      "main": [
        [
          {
            "node": "Respond to Webhook",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Get URL Page Contents": {
      "main": [
        [
          {
            "node": "Respond with detected entities",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Respond with detected entities": {
      "main": [
        [
          {
            "node": "Google Entities",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
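
The Code node in the JSON above builds the request body for the analyzeEntities call. Outside n8n, the same logic can be sketched as a plain function. The names `buildApiRequest` and `maxChars` are illustrative and do not appear in the workflow; the 100,000-character cap and the field values are taken from the node's code.

```javascript
// Hypothetical standalone version of the Code node's logic.
// `buildApiRequest` and `maxChars` are illustrative names, not part of the workflow.
function buildApiRequest(html, maxChars = 100000) {
  if (typeof html !== "string") {
    throw new TypeError("Expected the fetched page as a string");
  }
  // Same truncation rule as the workflow: keep at most `maxChars` characters.
  const content = html.length > maxChars ? html.substring(0, maxChars) : html;
  return {
    document: { type: "HTML", content },
    encodingType: "UTF8",
  };
}

// Usage: the returned object is what the "Google Entities" node sends as its JSON body.
const requestBody = buildApiRequest("<html><body><p>Hello from Paris</p></body></html>");
console.log(JSON.stringify(requestBody, null, 2));
```

Truncating by character count can cut the HTML in the middle of a tag, so a stricter variant might strip markup first; that is a design choice the workflow leaves open.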

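Features

  • Webhook trigger that accepts a POST request containing the URL to analyze
  • Automatic retrieval of the page's HTML
  • Entity analysis through Google's Natural Language API (analyzeEntities)
  • Truncation of pages longer than 100,000 characters before the API call
  • Full API response returned to the webhook caller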
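
The response returned to the caller contains an `entities` array whose items carry `name`, `type`, and `salience`, as described in the workflow's sticky note. A caller that only wants the most important entities could rank them as in this minimal sketch. The function name `topEntities` and the sample values are made up for illustration.

```javascript
// Hypothetical helper: keep the `limit` most salient entities from a response.
function topEntities(response, limit = 5) {
  const entities = Array.isArray(response?.entities) ? response.entities : [];
  return [...entities] // copy so the caller's array is not reordered
    .sort((a, b) => (b.salience ?? 0) - (a.salience ?? 0))
    .slice(0, limit)
    .map(({ name, type, salience }) => ({ name, type, salience }));
}

// Made-up sample shaped like an analyzeEntities response.
const sampleResponse = {
  entities: [
    { name: "Mountain View", type: "LOCATION", salience: 0.3 },
    { name: "Google", type: "ORGANIZATION", salience: 0.6 },
    { name: "Sundar Pichai", type: "PERSON", salience: 0.1 },
  ],
};

console.log(topEntities(sampleResponse, 2));
```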

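Technical Analysis

Node Types and Roles

  • respondToWebhook: returns the Google API response to the webhook caller
  • httpRequest (used twice): fetches the target page, then calls the analyzeEntities endpoint
  • stickyNote: holds the usage documentation shown on the canvas
  • webhook: receives the POST request that carries the URL
  • code: builds the API request body from the fetched HTML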
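
The order in which these nodes run is defined by the `connections` object in the workflow JSON, not by the order of the `nodes` array. The sketch below walks that object from the webhook node to recover the execution chain. The helper name `linearOrder` is illustrative, and it assumes a workflow without branches, which holds for this template.

```javascript
// Hypothetical helper: follow the first "main" output of each node.
// Assumes a linear workflow (one outgoing connection per node).
function linearOrder(connections, startNode) {
  const order = [startNode];
  const seen = new Set(order);
  let current = startNode;
  while (connections[current]) {
    const next = connections[current].main?.[0]?.[0]?.node;
    if (!next || seen.has(next)) break; // end of chain, or a cycle
    order.push(next);
    seen.add(next);
    current = next;
  }
  return order;
}

// The connections object, reduced from the workflow JSON.
const connections = {
  "Get Url": { main: [[{ node: "Get URL Page Contents", type: "main", index: 0 }]] },
  "Google Entities": { main: [[{ node: "Respond to Webhook", type: "main", index: 0 }]] },
  "Get URL Page Contents": { main: [[{ node: "Respond with detected entities", type: "main", index: 0 }]] },
  "Respond with detected entities": { main: [[{ node: "Google Entities", type: "main", index: 0 }]] },
};

console.log(linearOrder(connections, "Get Url").join(" -> "));
```

Note that the Code node is named "Respond with detected entities" although it only prepares the request; the actual response is sent by "Respond to Webhook".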

Complexity Assessment

Configuration difficulty: ★★★☆☆
Maintenance difficulty: ★★☆☆☆
Extensibility: ★★★★☆

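Implementation Guide

Prerequisites

  • Access to an n8n instance
  • A Google Cloud API key with the Natural Language API enabled
  • A web page URL that the n8n instance can reach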

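Configuration Steps

  1. Import the workflow JSON file into n8n
  2. In the Google Entities node, replace YOUR-GOOGLE-API-KEY with your Google Cloud API key
  3. Test the workflow with the webhook's test URL
  4. Activate the workflow and note the production webhook URL
  5. Send a POST request with a JSON body such as {"url": "https://example.com/page"}
  6. Review the returned entity analysis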
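
Calling the webhook from a script could look like the sketch below. It assumes Node.js 18 or later for the built-in `fetch`. The host `your-n8n-host` is a placeholder, and `buildWebhookRequest` and `analyzePage` are hypothetical names; only the POST method, the JSON body shape, and the webhook path come from the workflow.

```javascript
// Hypothetical helper: build the request that the "Get Url" webhook expects.
function buildWebhookRequest(webhookUrl, pageUrl) {
  return {
    url: webhookUrl,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url: pageUrl }),
    },
  };
}

// Hypothetical helper: send the request and return the parsed entity analysis.
async function analyzePage(webhookUrl, pageUrl) {
  const { url, options } = buildWebhookRequest(webhookUrl, pageUrl);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Webhook returned HTTP ${response.status}`);
  }
  return response.json();
}

// Placeholder host; the path is the one defined in the workflow JSON.
const exampleRequest = buildWebhookRequest(
  "https://your-n8n-host/webhook/2944c8f6-03cd-4ab8-8b8e-cb033edf877a",
  "https://example.com/page"
);
console.log(exampleRequest.options.body);
```

Separating request construction from the network call keeps the payload easy to inspect before anything is sent.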

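Key Parameters

Parameter                            Value in template       Description
key (Google Entities)                YOUR-GOOGLE-API-KEY     Google Cloud API key, sent as a query parameter
document.type (Code node)            HTML                    Content type declared to the Natural Language API
encodingType (Code node)             UTF8                    Encoding used for text offsets in the API response
Truncation limit (Code node)         100000                  Maximum number of characters of HTML sent to the API
httpMethod (Get Url)                 POST                    HTTP method the webhook accepts
responseMode (Get Url)               responseNode            The reply is sent by the Respond to Webhook node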

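Best Practices

Optimization Suggestions

  • Adjust the 100,000-character truncation limit in the Code node to match the pages you analyze
  • Filter the returned entities by salience to keep only the most relevant ones
  • Test with a few representative pages before relying on the results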

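Security Considerations

  • Keep API keys and credentials safe; the template stores the key in plain text as a query parameter, so consider moving it into an n8n credential
  • Restrict access to the workflow; the webhook in the template has no authentication configured
  • Review execution logs regularly
  • Enable two-factor authentication on the Google account that owns the API key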
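
When the API is called from your own code rather than from the n8n node, the key can be read from the environment instead of being hard-coded. The sketch below builds the analyzeEntities URL that way. The variable name `GOOGLE_API_KEY` and the function `buildEntitiesUrl` are assumptions for illustration; the endpoint and the `key` query parameter come from the workflow.

```javascript
// Endpoint taken from the "Google Entities" node.
const ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities";

// Hypothetical helper: read the key from an environment-like object.
// GOOGLE_API_KEY is an assumed variable name, not something n8n defines.
function buildEntitiesUrl(env) {
  const apiKey = env.GOOGLE_API_KEY;
  if (!apiKey || apiKey === "YOUR-GOOGLE-API-KEY") {
    throw new Error("Set GOOGLE_API_KEY to a real Google Cloud API key");
  }
  const url = new URL(ENDPOINT);
  url.searchParams.set("key", apiKey); // same query parameter the workflow uses
  return url.toString();
}

// Usage with a fake key; in a real script, pass process.env instead.
console.log(buildEntitiesUrl({ GOOGLE_API_KEY: "fake-key-for-demo" }));
```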

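Performance Optimization

  • Cache results for URLs that are analyzed repeatedly
  • Send one request per page; each webhook call processes a single URL
  • Monitor Natural Language API usage and quota
  • Monitor resource usage on the n8n instance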

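Troubleshooting

Common Issues

The page cannot be fetched

Check that the request body contains a url field; the Get URL Page Contents node reads it from $json.body.url. Also confirm that the URL is reachable from the n8n instance.

The Google API request fails

Confirm that the API key is valid, that it has replaced the YOUR-GOOGLE-API-KEY placeholder, and that the Natural Language API is enabled for the project.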
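
One failure mode specific to this workflow is a webhook body without a usable `url` field, because the Get URL Page Contents node reads `$json.body.url` directly. A check like the following could run before the fetch, for example in an extra Code node. The function name `validateWebhookBody` and the error messages are illustrative.

```javascript
// Hypothetical validator for the webhook's JSON body.
function validateWebhookBody(body) {
  if (!body || typeof body.url !== "string") {
    return { ok: false, error: 'Body must be JSON with a string "url" field' };
  }
  let parsed;
  try {
    parsed = new URL(body.url); // throws on relative or malformed URLs
  } catch {
    return { ok: false, error: "url is not a valid absolute URL" };
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return { ok: false, error: "Only http and https URLs are supported" };
  }
  return { ok: true, url: parsed.toString() };
}

console.log(validateWebhookBody({ url: "https://example.com/page" }));
console.log(validateWebhookBody({ link: "https://example.com/page" }));
```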

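Debugging Tips

  • Enable detailed logging to inspect the execution of each step
  • Use a known test page to verify the returned entities
  • Check network connectivity and the status of the Google API
  • Execute the workflow step by step to locate the failing node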

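Error Handling

The exported workflow does not configure any error handling: neither HTTP Request node sets retry or timeout options, and there is no error branch. If you need the following behaviors, add them yourself:

  • Automatic retry on network timeout (up to 3 retries)
  • Logging and alerting for API errors
  • An error response to the webhook caller when the page fetch or the API call fails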
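
The retry behavior listed above (up to 3 retries) is not visible in the workflow JSON. A minimal sketch of how it could be implemented around any request is shown below. The names `retryDelays` and `withRetry`, the 500 ms base delay, and the doubling backoff are all assumptions, not settings from the template.

```javascript
// Hypothetical helper: delays (in ms) to wait before each retry, doubling each time.
function retryDelays(maxRetries = 3, baseMs = 500) {
  return Array.from({ length: maxRetries }, (_, i) => baseMs * 2 ** i);
}

// Hypothetical helper: run `task` once, then retry up to `maxRetries` times on failure.
async function withRetry(task, maxRetries = 3, baseMs = 500) {
  const delays = retryDelays(maxRetries, baseMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= delays.length) throw err; // retries exhausted
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}

console.log(retryDelays());
```

A call such as `withRetry(() => fetch(pageUrl))` would then retry the page fetch; whether to retry on every error or only on timeouts is a choice left to the implementer.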